Last updated: 2025-06-25
Knit directory: casper_ss_ma/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(12345) was run prior to running the
code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.
| absolute | relative |
|---|---|
| /Volumes/scratch/DIMA/piva/casper_ss_ma/ | .. |
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 8bb180c. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish or
wflow_git_commit). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .RData
Ignored: .Rhistory
Ignored: .Rproj.user/
Untracked files:
Untracked: .DS_Store
Untracked: analysis/.DS_Store
Untracked: analysis/02_degs_go_aneuploidy_median.Rmd
Untracked: analysis/03_degs_go_CD82expr_median.Rmd
Untracked: analysis/VennDiagram.2025-06-09_13-53-40.335615.log
Untracked: analysis/VennDiagram.2025-06-09_13-54-51.029086.log
Untracked: analysis/VennDiagram.2025-06-09_13-55-15.147126.log
Untracked: analysis/VennDiagram.2025-06-09_13-56-18.122749.log
Untracked: analysis/VennDiagram.2025-06-09_13-56-30.934079.log
Untracked: analysis/VennDiagram.2025-06-09_14-18-19.412377.log
Untracked: analysis/VennDiagram.2025-06-18_10-28-53.699452.log
Untracked: analysis/VennDiagram.2025-06-18_10-37-36.77178.log
Untracked: analysis/VennDiagram.2025-06-18_11-32-36.228427.log
Untracked: analysis/VennDiagram.2025-06-18_15-38-55.387683.log
Untracked: analysis/VennDiagram.2025-06-18_15-48-17.579371.log
Untracked: analysis/VennDiagram.2025-06-18_17-18-17.268774.log
Untracked: analysis/VennDiagram.2025-06-19_11-11-17.376961.log
Untracked: analysis/VennDiagram.2025-06-19_14-52-46.049026.log
Untracked: analysis/VennDiagram.2025-06-19_16-40-05.861139.log
Untracked: analysis/VennDiagram.2025-06-19_16-40-07.33202.log
Untracked: analysis/VennDiagram.2025-06-19_16-40-08.673023.log
Untracked: analysis/VennDiagram.2025-06-19_17-50-05.238063.log
Untracked: analysis/VennDiagram.2025-06-19_17-50-07.22979.log
Untracked: analysis/VennDiagram.2025-06-19_17-50-09.007028.log
Untracked: analysis/VennDiagram.2025-06-19_18-48-01.885712.log
Untracked: analysis/VennDiagram.2025-06-19_18-48-03.579702.log
Untracked: analysis/VennDiagram.2025-06-19_18-48-04.898695.log
Untracked: analysis/VennDiagram.2025-06-20_10-18-23.300456.log
Untracked: analysis/VennDiagram.2025-06-20_10-18-24.588109.log
Untracked: analysis/VennDiagram.2025-06-20_10-18-26.077856.log
Untracked: analysis/VennDiagram.2025-06-20_10-50-54.081682.log
Untracked: analysis/VennDiagram.2025-06-20_10-50-55.516535.log
Untracked: analysis/VennDiagram.2025-06-20_10-50-56.913582.log
Untracked: analysis/VennDiagram.2025-06-20_11-10-43.68944.log
Untracked: analysis/VennDiagram.2025-06-20_11-10-45.681514.log
Untracked: analysis/VennDiagram.2025-06-20_11-10-47.126222.log
Untracked: analysis/VennDiagram.2025-06-20_12-19-10.326514.log
Untracked: analysis/VennDiagram.2025-06-20_12-19-11.75991.log
Untracked: analysis/VennDiagram.2025-06-20_12-19-13.198666.log
Untracked: analysis/VennDiagram.2025-06-20_12-29-09.447741.log
Untracked: analysis/VennDiagram.2025-06-20_12-29-11.214146.log
Untracked: analysis/VennDiagram.2025-06-20_12-29-12.791818.log
Untracked: analysis/VennDiagram.2025-06-20_12-44-02.971891.log
Untracked: analysis/VennDiagram.2025-06-20_12-44-04.709094.log
Untracked: analysis/VennDiagram.2025-06-20_12-44-06.321173.log
Untracked: analysis/VennDiagram.2025-06-24_15-54-45.065538.log
Untracked: analysis/VennDiagram.2025-06-24_15-54-48.303942.log
Untracked: analysis/VennDiagram.2025-06-24_15-54-50.098014.log
Untracked: analysis/VennDiagram.2025-06-25_11-53-49.958809.log
Untracked: analysis/VennDiagram.2025-06-25_11-53-51.64026.log
Untracked: analysis/VennDiagram.2025-06-25_11-53-53.29465.log
Untracked: analysis/VennDiagram.2025-06-25_15-09-09.87969.log
Untracked: analysis/VennDiagram.2025-06-25_15-09-14.193409.log
Untracked: analysis/VennDiagram.2025-06-25_15-09-17.485413.log
Untracked: analysis/VennDiagram.2025-06-25_15-34-29.722117.log
Untracked: analysis/VennDiagram.2025-06-25_15-34-31.791802.log
Untracked: analysis/VennDiagram.2025-06-25_15-34-34.21193.log
Untracked: analysis/hsa04064.HLT-HighAS_vs_HLT-LowAS.png
Untracked: analysis/hsa04064.HLT-HighCD82_vs_HLT-LowCD82.png
Untracked: analysis/hsa04064.HRplus-HighAS_vs_HRplus-LowAS.png
Untracked: analysis/hsa04064.HRplus-HighCD82_vs_HRplus-LowCD82.png
Untracked: analysis/hsa04064.HRplus_vs_HLT.png
Untracked: analysis/hsa04064.TNBC-HighAS_vs_TNBC-LowAS.png
Untracked: analysis/hsa04064.TNBC-HighCD82_vs_TNBC-LowCD82.png
Untracked: analysis/hsa04064.TNBC_vs_HLT.png
Untracked: analysis/hsa04064.TNBC_vs_HRplus.png
Untracked: analysis/hsa04064.png
Untracked: analysis/hsa04064.xml
Untracked: code/
Untracked: data/
Untracked: degs_HLT-HighAS_vs_HLT-LowAS.csv
Untracked: degs_HRplus-HighAS_vs_HRplus-LowAS.csv
Untracked: output/
Unstaged changes:
Modified: analysis/00_casper_analysis.Rmd
Deleted: analysis/02_deconvolution.Rmd
Modified: casper_ss_ma.Rproj
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (analysis/03_degs_go_CD82expr.Rmd)
and HTML (docs/03_degs_go_CD82expr.html) files. If you’ve
configured a remote Git repository (see ?wflow_git_remote),
click on the hyperlinks in the table below to view the files as they
were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | 8bb180c | annamariapiva | 2025-06-25 | updated notebooks 01, 02, 03 and 04 |
| html | a039b3f | annamariapiva | 2025-06-20 | Build site. |
| Rmd | f0e862c | annamariapiva | 2025-06-20 | new reports |
The goal of this analysis is to identify which pathways are up- or down-regulated in samples with high versus low CD82 expression. For each condition (Healthy, HR+, and TNBC), patients are divided into high and low CD82 expression groups using the median expression value as the cutoff. The following comparisons are performed:
HR+ High-CD82 vs HR+ Low-CD82
TNBC High-CD82 vs TNBC Low-CD82
Healthy High-CD82 vs Healthy Low-CD82
The input for the following analysis is:
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
The first steps of the analysis in R are to load the required packages, load the input data mentioned above, and set the thresholds for the analysis:
To classify samples into High and Low CD82 expression groups, we examined the distribution of CD82 expression across all samples from the three conditions.
In the distribution plots:
The blue line indicates the median CD82 expression of the displayed samples.
The red line indicates the median CD82 expression of all the samples (the cutoff used).
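The median split described above can be sketched in base R as follows. This is a minimal illustration on simulated values, not the code used for this report: the data frame, its column names, and the handling of samples exactly at the cutoff (assigned to Low here) are assumptions. The condition labels `HLT`, `HRplus`, and `TNBC` follow the file names listed in the repository status.

```r
# Toy data: CD82 expression for samples from three conditions (simulated values).
set.seed(12345)
samples <- data.frame(
  sample_id = paste0("S", 1:12),
  condition = rep(c("HLT", "HRplus", "TNBC"), each = 4),
  cd82_expr = round(rlnorm(12, meanlog = 5, sdlog = 0.5), 1)
)

# Cutoff: median CD82 expression across all samples (the red line).
global_cutoff <- median(samples$cd82_expr)

# Samples strictly above the cutoff are labelled High, the rest Low.
# (How ties are handled is an assumption of this sketch.)
samples$cd82_group <- ifelse(samples$cd82_expr > global_cutoff,
                             "HighCD82", "LowCD82")

# Per-condition medians (the blue lines), for comparison with the cutoff.
condition_medians <- tapply(samples$cd82_expr, samples$condition, median)

# Group labels in the form used for the comparisons, e.g. "TNBC-HighCD82".
samples$comparison_group <- paste(samples$condition, samples$cd82_group,
                                  sep = "-")

# Number of High and Low samples per condition.
table(samples$condition, samples$cd82_group)
```

Because the cutoff is the median of all samples rather than of each condition, the High and Low groups within a single condition are not necessarily the same size, which is what comparing the blue and red lines makes visible.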




Differential expression analysis is performed using a custom function that accounts for batch effects. A batch effect occurs when non-biological factors, such as laboratory conditions or the instruments used, cause changes in the data an experiment produces. Lowly expressed genes are removed to reduce noise. Here, lowly expressed genes are defined as:
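As a rough illustration of the two ideas in this paragraph, the sketch below filters lowly expressed genes and builds a design matrix that includes batch as a covariate. The custom function itself is not shown in this report, so everything here is an assumption: the toy counts, the `batch` and `group` columns, and in particular the filter values `min_count` and `min_samples`, which are hypothetical placeholders rather than the thresholds used in the analysis.

```r
# Toy count matrix: 6 genes x 6 samples (made-up values).
counts <- matrix(
  c(  0,   1,   0,   2,   0,   1,
    150, 200, 180, 220, 170, 210,
     12,   8,  15,  11,   9,  14,
      3,   0,   2,   1,   0,   4,
     90,  85, 110,  95, 100, 105,
     30,  25,  40,  35,  28,  33),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), paste0("S", 1:6))
)

# Sample annotation: batch and expression group (hypothetical columns).
sample_info <- data.frame(
  row.names = colnames(counts),
  batch = factor(c("b1", "b1", "b2", "b2", "b1", "b2")),
  group = factor(c("Low", "Low", "Low", "High", "High", "High"),
                 levels = c("Low", "High"))
)

# Hypothetical filter: keep genes with at least `min_count` reads
# in at least `min_samples` samples.
min_count <- 10
min_samples <- 3
keep <- rowSums(counts >= min_count) >= min_samples
filtered_counts <- counts[keep, , drop = FALSE]

# Batch enters the model as a covariate, so the group effect is estimated
# after adjusting for batch; the counts themselves are left unchanged.
design <- model.matrix(~ batch + group, data = sample_info)
colnames(design)
```

A formula of the same shape, `~ batch + group`, is the usual way to pass a batch covariate to DESeq2, which appears in the session information; whether the custom function does exactly this is not confirmed by the report.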
Let’s have a look at the PCA and at the gene expression pattern across samples. The batch effect is accounted for in the design, but the data shown in this plot have not been batch-corrected.
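A PCA of this kind can be sketched with base R's `prcomp`. The example below uses simulated expression values with an artificial shift in one batch, to show why samples may separate by batch when no correction is applied to the plotted data. The matrix, the batch labels, and the size of the shift are all assumptions made for illustration.

```r
set.seed(12345)

# Toy normalized expression matrix: 50 genes x 6 samples (simulated values).
expr <- matrix(rnorm(50 * 6, mean = 8, sd = 1), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("S", 1:6)))
batch <- factor(c("b1", "b1", "b2", "b2", "b1", "b2"))

# Shift the first 10 genes in batch b2 to mimic an uncorrected batch effect.
expr[1:10, batch == "b2"] <- expr[1:10, batch == "b2"] + 3

# prcomp expects samples in rows, so the matrix is transposed.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Percentage of variance explained by each principal component.
var_explained <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Sample coordinates on the first two components, ready for plotting.
pca_df <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], batch = batch)
pca_df
```

Colouring the points of `pca_df` by `batch` is a quick way to see how much of the leading variation follows batch rather than the biological grouping.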
Here is the PCA of the selected samples from the first comparison.
Genes are annotated as significant or not significant. A gene is considered significant when its adjusted p-value is below the threshold set above and its absolute log2 fold change is greater than the cutoff set above.
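The annotation rule above can be written as a short base R sketch. The results table is made up and uses DESeq2-style column names (`log2FoldChange`, `padj`). The thresholds `padj_threshold` and `lfc_cutoff` are hypothetical placeholders, since the values set earlier in the analysis are not shown in this report.

```r
# Toy results table with DESeq2-style columns (made-up values).
res <- data.frame(
  gene = paste0("gene", 1:6),
  log2FoldChange = c(2.1, -1.8, 0.4, -0.2, 1.5, NA),
  padj = c(0.001, 0.03, 0.0005, 0.60, 0.20, NA)
)

# Hypothetical thresholds; the report's actual values are not shown.
padj_threshold <- 0.05
lfc_cutoff <- 1

# A gene is significant only if both criteria hold; genes with missing
# values are treated as not significant.
is_sig <- !is.na(res$padj) & !is.na(res$log2FoldChange) &
  res$padj < padj_threshold & abs(res$log2FoldChange) > lfc_cutoff

res$significance <- ifelse(is_sig, "Significant", "Not significant")

# Direction of change for significant genes, "NS" otherwise.
res$direction <- ifelse(!is_sig, "NS",
                        ifelse(res$log2FoldChange > 0, "Up", "Down"))
res
```

In this toy table, `gene3` has a small adjusted p-value but a fold change below the cutoff, and `gene5` has a large fold change but a high adjusted p-value, so neither is labelled significant.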
Among the differentially expressed genes computed above, let’s visualize the top 20 significant genes and all the DE genes.
Meaning of Colors

Further analysis is done through gene set enrichment analysis (GSEA), which, unlike the previous step, does not exclude genes based on log2 fold change or adjusted p-value. GSEA is performed separately on each GO subontology: biological process (BP), cellular component (CC), and molecular function (MF). The dot plot below shows the top 10 most enriched GO terms. The size of each dot reflects the count of differentially expressed genes associated with the GO term, and the color of each dot reflects the significance of the enrichment.
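Because GSEA uses all genes rather than a filtered subset, its input is typically a ranked list of every gene. A minimal base R sketch of how such a list might be prepared is shown below. The gene names and values are made up, and the rule for resolving duplicated gene identifiers (keep the largest absolute change) is an assumption, not something stated in this report.

```r
# Toy results: one row per gene, with a duplicate and a missing value (made up).
res <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD", "geneB", "geneE"),
  log2FoldChange = c(1.2, -0.7, NA, 2.5, -0.9, 0.1)
)

# Drop genes without a fold-change estimate.
res <- res[!is.na(res$log2FoldChange), ]

# If a gene appears more than once, keep the entry with the largest
# absolute change (an assumption of this sketch).
res <- res[order(-abs(res$log2FoldChange)), ]
res <- res[!duplicated(res$gene), ]

# Named numeric vector sorted in decreasing order: no gene is excluded
# by fold change or adjusted p-value.
gene_list <- sort(setNames(res$log2FoldChange, res$gene), decreasing = TRUE)
gene_list
```

A decreasing, named vector like `gene_list` is the form accepted by the `geneList` argument of `clusterProfiler::gseGO`, whose `ont` argument selects BP, CC, or MF. clusterProfiler appears in the session information, but the exact call used here is not shown.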



To identify biologically meaningful patterns of gene expression, we performed Gene Set Enrichment Analysis (GSEA) using the MSigDB Hallmark gene sets, which summarize well-defined biological states or processes. Genes were ranked by log2 fold change. Significantly enriched pathways were identified using the normalized enrichment score (NES) and the FDR-adjusted p-value (p.adj < 0.05). Pathways with a positive NES are upregulated in the first group of the comparison, while pathways with a negative NES are suppressed in the first group relative to the second.
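The interpretation rule above (adjusted p-value below 0.05, direction given by the sign of the NES) can be sketched in base R. The table below mimics the `pathway`, `NES`, and `padj` columns of an fgsea result. The Hallmark set names are real, but the numbers are made up for illustration and are not results from this analysis.

```r
# Toy GSEA results with fgsea-style columns (made-up values, not real results).
gsea_res <- data.frame(
  pathway = c("HALLMARK_TNFA_SIGNALING_VIA_NFKB", "HALLMARK_E2F_TARGETS",
              "HALLMARK_HYPOXIA", "HALLMARK_MYC_TARGETS_V1"),
  NES = c(2.1, -1.9, 1.3, -1.1),
  padj = c(0.001, 0.010, 0.200, 0.040)
)

# Significance cutoff stated in the text.
padj_cutoff <- 0.05
sig <- gsea_res[gsea_res$padj < padj_cutoff, ]

# Positive NES: upregulated in the first group of the comparison.
# Negative NES: suppressed in the first group.
sig$direction <- ifelse(sig$NES > 0, "Up in first group",
                        "Down in first group")

# Order from most positively to most negatively enriched.
sig <- sig[order(sig$NES, decreasing = TRUE), ]
sig
```

Here `HALLMARK_HYPOXIA` is dropped despite its positive NES because its adjusted p-value exceeds the cutoff.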

[1] "Note: 4434 of 14384 unique input IDs unmapped."
Here is the PCA of the selected samples from the second comparison.
Genes are annotated as significant or not significant. A gene is considered significant when its adjusted p-value is below the threshold set above and its absolute log2 fold change is greater than the cutoff set above.
Among the differentially expressed genes computed above, let’s visualize the top 20 significant genes and all the DE genes.
Meaning of Colors





Here is the PCA of the selected samples from the third comparison.
Genes are annotated as significant or not significant. A gene is considered significant when its adjusted p-value is below the threshold set above and its absolute log2 fold change is greater than the cutoff set above.
Among the differentially expressed genes computed above, let’s visualize the top 20 significant genes and all the DE genes.
Meaning of Colors





R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.4.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Rome
tzcode source: internal
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] VennDiagram_1.7.3 futile.logger_1.4.3
[3] pathview_1.40.0 tibble_3.3.0
[5] fgsea_1.26.0 msigdbr_24.1.0
[7] gridExtra_2.3 dplyr_1.1.4
[9] clusterProfiler_4.8.2 plotly_4.10.4
[11] reshape_0.8.9 ggplot2_3.5.2
[13] gplots_3.2.0 RColorBrewer_1.1-3
[15] ComplexHeatmap_2.16.0 rtracklayer_1.60.1
[17] DESeq2_1.40.2 SummarizedExperiment_1.30.2
[19] Biobase_2.60.0 MatrixGenerics_1.12.3
[21] matrixStats_1.5.0 GenomicRanges_1.52.1
[23] GenomeInfoDb_1.36.4 IRanges_2.34.1
[25] S4Vectors_0.38.2 BiocGenerics_0.46.0
[27] DT_0.33
loaded via a namespace (and not attached):
[1] splines_4.3.1 later_1.4.2 BiocIO_1.10.0
[4] bitops_1.0-9 ggplotify_0.1.2 polyclip_1.10-7
[7] graph_1.78.0 XML_3.99-0.18 lifecycle_1.0.4
[10] doParallel_1.0.17 rprojroot_2.0.4 lattice_0.22-7
[13] MASS_7.3-60 crosstalk_1.2.1 magrittr_2.0.3
[16] sass_0.4.10 rmarkdown_2.29 jquerylib_0.1.4
[19] yaml_2.3.10 httpuv_1.6.16 cowplot_1.1.3
[22] DBI_1.2.3 abind_1.4-8 zlibbioc_1.46.0
[25] purrr_1.0.4 ggraph_2.2.1 RCurl_1.98-1.17
[28] yulab.utils_0.2.0 tweenr_2.0.3 git2r_0.36.2
[31] circlize_0.4.16 GenomeInfoDbData_1.2.10 enrichplot_1.20.0
[34] ggrepel_0.9.6 tidytree_0.4.6 codetools_0.2-20
[37] DelayedArray_0.26.7 DOSE_3.26.2 ggforce_0.4.2
[40] tidyselect_1.2.1 shape_1.4.6.1 aplot_0.2.5
[43] farver_2.1.2 viridis_0.6.5 GenomicAlignments_1.36.0
[46] jsonlite_2.0.0 GetoptLong_1.0.5 tidygraph_1.3.1
[49] iterators_1.0.14 foreach_1.5.2 tools_4.3.1
[52] treeio_1.24.3 Rcpp_1.0.14 glue_1.8.0
[55] xfun_0.52 qvalue_2.32.0 withr_3.0.2
[58] formatR_1.14 fastmap_1.2.0 caTools_1.18.3
[61] digest_0.6.37 R6_2.6.1 gridGraphics_0.5-1
[64] colorspace_2.1-1 GO.db_3.17.0 gtools_3.9.5
[67] RSQLite_2.4.1 tidyr_1.3.1 generics_0.1.4
[70] data.table_1.17.6 graphlayouts_1.2.2 httr_1.4.7
[73] htmlwidgets_1.6.4 S4Arrays_1.0.6 scatterpie_0.2.4
[76] whisker_0.4.1 pkgconfig_2.0.3 gtable_0.3.6
[79] blob_1.2.4 workflowr_1.7.1 XVector_0.40.0
[82] shadowtext_0.1.4 htmltools_0.5.8.1 clue_0.3-66
[85] scales_1.4.0 png_0.1-8 ggfun_0.1.8
[88] lambda.r_1.2.4 knitr_1.50 rstudioapi_0.17.1
[91] reshape2_1.4.4 rjson_0.2.23 nlme_3.1-168
[94] curl_6.3.0 org.Hs.eg.db_3.17.0 cachem_1.1.0
[97] GlobalOptions_0.1.2 stringr_1.5.1 KernSmooth_2.23-26
[100] parallel_4.3.1 HDO.db_0.99.1 AnnotationDbi_1.62.2
[103] restfulr_0.0.15 pillar_1.10.2 vctrs_0.6.5
[106] promises_1.3.3 cluster_2.1.8.1 Rgraphviz_2.44.0
[109] evaluate_1.0.4 KEGGgraph_1.60.0 cli_3.6.5
[112] locfit_1.5-9.12 compiler_4.3.1 futile.options_1.0.1
[115] Rsamtools_2.16.0 rlang_1.1.6 crayon_1.5.3
[118] labeling_0.4.3 plyr_1.8.9 fs_1.6.6
[121] stringi_1.8.7 viridisLite_0.4.2 BiocParallel_1.34.2
[124] assertthat_0.2.1 babelgene_22.9 Biostrings_2.68.1
[127] lazyeval_0.2.2 GOSemSim_2.26.1 Matrix_1.6-4
[130] patchwork_1.3.0 bit64_4.6.0-1 KEGGREST_1.40.1
[133] igraph_2.1.4 memoise_2.0.1 bslib_0.9.0
[136] ggtree_3.8.2 fastmatch_1.1-6 bit_4.6.0
[139] downloader_0.4.1 ape_5.8-1 gson_0.1.0